iterative regularization
Beyond Tikhonov: faster learning with self-concordant losses, via iterative regularization
The theory of spectral filtering is a remarkable tool for understanding the statistical properties of learning with kernels. For least squares, it makes it possible to derive various regularization schemes that yield faster convergence rates of the excess risk than Tikhonov regularization. This is typically achieved by leveraging classical assumptions called source and capacity conditions, which characterize the difficulty of the learning task. In order to understand estimators derived from other loss functions, Marteau-Ferey et al. extended the theory of Tikhonov regularization to generalized self-concordant (GSC) loss functions, which include, e.g., the logistic loss. In this paper, we go a step further and show that fast and optimal rates can be achieved for GSC losses by using the iterated Tikhonov regularization scheme, which is intrinsically related to the proximal point method in optimization and overcomes the limitations of classical Tikhonov regularization.
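The link between iterated Tikhonov and the proximal point method can be seen in a toy least-squares setting (the paper itself treats GSC losses such as the logistic loss; least squares is used here only because each step has a closed form, and all variable names are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.standard_normal((50, 5))
w_star = rng.standard_normal(5)
y = X @ w_star  # noiseless, so the target of learning is w_star itself

lam = 1.0
A = X.T @ X + lam * np.eye(5)
b = X.T @ y

w = np.zeros(5)
for _ in range(200):
    # Proximal point step: w_{t+1} = argmin_w ||Xw - y||^2 + lam * ||w - w_t||^2,
    # i.e. Tikhonov regularization re-centered at the previous iterate.
    w = np.linalg.solve(A, b + lam * w)
```

Plain Tikhonov with fixed lam is biased toward the origin; re-centering the penalty at each iterate drives that bias to zero, which is the mechanism behind the faster rates discussed in the abstract.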
Iterative regularization in classification via hinge loss diagonal descent
Apidopoulos, Vassilis, Poggio, Tomaso, Rosasco, Lorenzo, Villa, Silvia
Estimating a quantity of interest from finite measurements is a central problem in a number of fields, including machine learning, statistics, and signal processing. In this context, a key idea is that reliable estimation requires imposing prior assumptions on the problem at hand. The theory of inverse problems provides a principled framework to formalize this idea [27]. The quantity of interest is typically seen as a function, or a vector, and prior assumptions take the form of suitable functionals, called regularizers. Following this idea, Tikhonov regularization provides a classic approach to estimating solutions [83, 84]: the latter are found by minimizing an empirical objective in which a data-fit term is penalized by adding the chosen regularizer. Other regularization approaches are classic in inverse problems; in particular, iterative regularization has become popular in machine learning, see e.g.
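The penalized data-fit scheme described above can be sketched for a linear forward operator (an illustrative toy, with made-up data; the closed form below is the standard one for the squared-norm regularizer):

```python
import numpy as np

def tikhonov(A, y, lam):
    """Closed-form minimizer of ||A x - y||^2 + lam * ||x||^2."""
    d = A.shape[1]
    return np.linalg.solve(A.T @ A + lam * np.eye(d), A.T @ y)

# Hypothetical forward operator and noisy measurements.
rng = np.random.default_rng(1)
A = rng.standard_normal((40, 40))
x_true = rng.standard_normal(40)
y = A @ x_true + 0.01 * rng.standard_normal(40)

x_hat = tikhonov(A, y, lam=1e-2)
```

Increasing lam shrinks the estimate toward zero (stronger prior), while lam -> 0 recovers the unregularized least-squares solution; iterative regularization replaces this explicit penalty with an early-stopped iteration.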
Deep unfolding as iterative regularization for imaging inverse problems
Cui, Zhuo-Xu, Zhu, Qingyong, Cheng, Jing, Liang, Dong
Recently, deep unfolding methods that guide the design of deep neural networks (DNNs) through iterative algorithms have received increasing attention in the field of inverse problems. Unlike general end-to-end DNNs, unfolding methods have better interpretability and performance. However, to our knowledge, their accuracy and stability in solving inverse problems cannot be fully guaranteed. To bridge this gap, we modified the training procedure and proved that the unfolding method is an iterative regularization method. More precisely, we jointly learn a convex penalty function adversarially, via an input-convex neural network (ICNN), to characterize the distance to a real data manifold, and train a DNN unfolded from the proximal gradient descent algorithm with this learned penalty. Assuming the real data manifold intersects the set of inverse problem solutions only at the unique real solution, we prove that the unfolded DNN converges to it stably. Furthermore, we demonstrate on an MRI reconstruction example that the proposed method outperforms conventional unfolding methods and traditional regularization methods in terms of reconstruction quality, stability, and convergence speed.
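The skeleton being unfolded is proximal gradient descent. A minimal sketch with a fixed l1 penalty (standing in for the learned ICNN penalty of the paper; step size, penalty weight, and data are all illustrative assumptions) shows what each "layer" computes:

```python
import numpy as np

def soft_threshold(z, tau):
    # Proximal operator of tau * ||.||_1 -- the slot that a learned
    # proximal module would occupy in an unfolded network.
    return np.sign(z) * np.maximum(np.abs(z) - tau, 0.0)

def unfolded_pgd(A, y, n_layers, lam):
    # Each "layer" is one proximal gradient step on
    # 0.5 * ||A x - y||^2 + lam * ||x||_1.
    # Deep unfolding would learn the step size and penalty; here both are fixed.
    step = 1.0 / np.linalg.norm(A, 2) ** 2
    x = np.zeros(A.shape[1])
    for _ in range(n_layers):
        x = soft_threshold(x - step * (A.T @ (A @ x - y)), step * lam)
    return x

rng = np.random.default_rng(2)
A = rng.standard_normal((50, 20))
x_true = np.zeros(20)
x_true[[3, 7, 11]] = [1.0, -2.0, 1.5]
y = A @ x_true
x_hat = unfolded_pgd(A, y, n_layers=300, lam=0.01)
```

Fixing the number of layers turns an iterative solver into a feed-forward network; the paper's contribution is proving that, with the adversarially learned penalty, this construction is itself an iterative regularization method.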
Iterative regularization for low complexity regularizers
Molinari, Cesare, Massias, Mathurin, Rosasco, Lorenzo, Villa, Silvia
Iterative regularization exploits the implicit bias of an optimization algorithm to regularize ill-posed problems. Constructing algorithms with such built-in regularization mechanisms is a classic challenge in inverse problems but also in modern machine learning, where it provides both a new perspective on algorithm analysis and significant speed-ups compared to explicit regularization. In this work, we propose and study the first iterative regularization procedure able to handle biases described by nonsmooth and not strongly convex functionals, prominent in low-complexity regularization. Our approach is based on a primal-dual algorithm, whose convergence and stability properties we analyze even in the case where the original problem is infeasible. The general results are illustrated considering the special case of sparse recovery with the $\ell_1$ penalty. Our theoretical results are complemented by experiments showing the computational benefits of our approach.
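The implicit-bias phenomenon underlying this line of work is easiest to see in a smooth analogue (this is not the paper's primal-dual algorithm, just a classical illustration): gradient descent started at zero on an underdetermined least-squares problem, with no penalty at all, selects the minimum-l2-norm solution.

```python
import numpy as np

rng = np.random.default_rng(3)
A = rng.standard_normal((10, 30))       # underdetermined: infinitely many solutions
y = A @ rng.standard_normal(30)

step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(30)                        # initialization determines the bias
for _ in range(2000):
    x = x - step * A.T @ (A @ x - y)    # plain gradient step, no regularizer

# The iterates never leave the row space of A, so gradient descent converges
# to the minimum-norm interpolant: regularization without a regularizer.
x_min_norm = np.linalg.pinv(A) @ y
```

The paper extends this mechanism beyond the smooth, strongly convex setting, so that the selected solution can minimize a nonsmooth bias such as the $\ell_1$ norm.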
Iterative regularization for convex regularizers
Molinari, Cesare, Massias, Mathurin, Rosasco, Lorenzo, Villa, Silvia
Machine learning often reduces to estimating some model parameters. This approach raises at least two kinds of questions: first, multiple solutions may exist, amongst which a specific one must be selected; second, potential instabilities with respect to noise and sampling must be controlled. A classical way to achieve both goals is to consider explicitly penalized or constrained objective functions. In machine learning, this leads to regularized empirical risk minimization (Shalev-Shwartz and Ben-David, 2014). A more recent approach is based on directly exploiting an iterative optimization procedure for an unconstrained/unpenalized problem. This approach is shared by several related ideas. One is implicit regularization (Mahoney, 2012; Gunasekar et al., 2017), stemming from the observation that the bias is controlled by increasing the number of iterations, just as in penalized methods it is controlled by decreasing the penalty parameter. Another is early stopping (Yao et al., 2007; Raskutti et al., 2014), putting emphasis on the fact that running the iterates to convergence might lead to instabilities in the presence of noise. Yet another, and more classical, idea is iterative regularization, where both aspects (convergence and stability) are considered to be relevant (Engl et al., 1996; Kaltenbacher et al., 2008).
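The correspondence "more iterations = smaller penalty" can be observed directly on a toy least-squares problem (all data and sizes below are illustrative): the distance of the gradient-descent iterate to the unpenalized minimizer shrinks monotonically with the iteration count, so stopping early acts like choosing a larger penalty.

```python
import numpy as np

rng = np.random.default_rng(4)
A = rng.standard_normal((40, 15))
y = rng.standard_normal(40)
x_star = np.linalg.lstsq(A, y, rcond=None)[0]   # unpenalized minimizer

step = 1.0 / np.linalg.norm(A, 2) ** 2
x = np.zeros(15)
dist = []
for t in range(300):
    x = x - step * A.T @ (A @ x - y)
    dist.append(np.linalg.norm(x - x_star))     # bias of the "t-th estimator"
```

With noisy data, the variance of the iterate grows with t while this bias shrinks, which is exactly the trade-off that early stopping and iterative regularization balance.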
Iterative Regularization for Learning with Convex Loss Functions
Lin, Junhong, Rosasco, Lorenzo, Zhou, Ding-Xuan
We consider the problem of supervised learning with convex loss functions and propose a new form of iterative regularization based on the subgradient method. Unlike other regularization approaches, in iterative regularization no constraint or penalization is considered, and generalization is achieved by (early) stopping an empirical iteration. We consider a nonparametric setting, in the framework of reproducing kernel Hilbert spaces, and prove finite sample bounds on the excess risk under general regularity conditions. Our study provides a new class of efficient regularized learning algorithms and gives insight into the interplay between statistics and optimization in machine learning.
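A minimal sketch of this recipe, on a hand-made toy dataset with the hinge loss and a linear kernel (my illustration of the general idea, not the paper's exact scheme; step size and stopping time are arbitrary assumptions):

```python
import numpy as np

# Well-separated 1-D training data.
X = np.array([[2.0], [3.0], [-2.0], [-3.0]])
y = np.array([1.0, 1.0, -1.0, -1.0])
K = X @ X.T                       # linear kernel; any PSD kernel matrix fits here

n = len(y)
alpha = np.zeros(n)               # f = sum_i alpha_i k(x_i, .)  (representer form)
eta = 0.1
T = 100                           # early-stopping time: the only regularization knob
for _ in range(T):
    margins = y * (K @ alpha)
    viol = margins < 1.0          # points inside the margin contribute a subgradient
    # subgradient step on (1/n) * sum_i max(0, 1 - y_i f(x_i)), no penalty term
    alpha += eta * (y * viol) / n

pred = np.sign(K @ alpha)         # predictions at the training points
```

There is no penalty and no constraint anywhere in the loop; the statistical control comes entirely from choosing T, which is the interplay between optimization and statistics the abstract refers to.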